
RECOME: a New Density-Based Clustering Algorithm Using Relative KNN Kernel Density



Abstract

Discovering clusters with different shapes, densities, and scales in a dataset is a known challenging problem in data clustering. In this paper, we propose the RElative COre MErge (RECOME) clustering algorithm. The core of RECOME is a novel density measure, the Relative $K$ nearest Neighbor Kernel Density (RNKD). RECOME identifies core objects as those with unit RNKD, and partitions non-core objects into atom clusters by successively following higher-density neighbor relations toward core objects. Core objects and their corresponding atom clusters are then merged through $\alpha$-reachable paths on a KNN graph. Furthermore, we discover that the number of clusters computed by RECOME is a step function of the $\alpha$ parameter, with jump discontinuities at a small collection of values. A jump discontinuity discovery (JDD) method is proposed, using a variant of Dijkstra's algorithm. RECOME is evaluated on three synthetic datasets and six real datasets. Experimental results indicate that RECOME is able to discover clusters with different shapes, densities, and scales. It achieves better clustering results than established density-based clustering methods on real datasets. Moreover, JDD is shown to be effective at extracting the jump discontinuity set of the parameter $\alpha$ for all tested datasets, which can ease the task of data exploration and parameter tuning.
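The first two stages named in the abstract (density estimation and atom-cluster formation) might be sketched as follows. The abstract does not state the RNKD formula, so the exponential kernel, the bandwidth `sigma`, and the normalization by the largest density in each object's neighborhood are assumptions made for illustration only. The function names `relative_knn_density` and `atom_clusters` are likewise hypothetical. The $\alpha$-reachable merge and the JDD method are not covered here.

```python
import numpy as np


def relative_knn_density(points, k):
    """Return (density, rnkd, knn) for an (n, d) array of points.

    ASSUMPTION: density is a sum of exp(-distance / sigma) over the k nearest
    neighbors, and RNKD divides it by the largest density found among the
    object and its k nearest neighbors. The abstract does not confirm this.
    """
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)          # an object is not its own neighbor
    knn = np.argsort(dist, axis=1)[:, :k]   # neighbor indices, nearest first
    knn_dist = np.take_along_axis(dist, knn, axis=1)
    sigma = knn_dist[:, -1].mean()          # assumed bandwidth: mean k-th neighbor distance
    density = np.exp(-knn_dist / sigma).sum(axis=1)
    local_max = np.maximum(density, density[knn].max(axis=1))
    return density, density / local_max, knn


def atom_clusters(density, rnkd, knn):
    """Label each object with the index of the core object it leads to."""
    n = len(density)
    parent = np.arange(n)
    for u in range(n):
        if rnkd[u] < 1.0:  # non-core: some neighbor is strictly denser
            higher = [v for v in knn[u] if density[v] > density[u]]
            parent[u] = higher[0]  # nearest neighbor with higher density
    labels = np.empty(n, dtype=int)
    for u in range(n):
        root = u
        while parent[root] != root:  # density strictly increases, so this terminates
            root = parent[root]
        labels[u] = root
    return labels


# Two small, well-separated groups: a unit square plus its center, twice.
square = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
points = np.vstack([square, square + 10.0])
density, rnkd, knn = relative_knn_density(points, k=2)
labels = atom_clusters(density, rnkd, knn)
print("core objects:", np.flatnonzero(rnkd == 1.0))
print("atom cluster labels:", labels)
```

In this sketch a core object is simply a local density maximum within its own KNN neighborhood, which is one way to read "unit RNKD". Each non-core object is attached to its nearest denser neighbor, so every chain ends at a core object.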
